Talk Overview


• Basics of machine learning vs. regression, 
interpreting MLMs 

• LTV and churn modeling 

• LTV vs. CaC 


• Network models and adjusting/accounting for social value

• Attribution & approaches, empirical benchmarks
Regression


Y = bX + a
Machine learning and predictive models: power vs. understandability


• A->B->C->D 45/50 times. Now A->B->C->? 

• Now you have 90% probability. Awesome. But . . . 

• So, do you need to understand "Why?" 
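The 45/50 example above can be sketched as a simple frequency count: a minimal illustration, assuming event histories are stored as strings of single-letter events. The function name and the sample data are hypothetical, not part of the talk.

```python
from collections import Counter


def next_event_probabilities(sequences, prefix):
    """Estimate P(next event | prefix) from observed event sequences."""
    n = len(prefix)
    counts = Counter(
        seq[n] for seq in sequences
        if len(seq) > n and seq[:n] == prefix
    )
    total = sum(counts.values())
    return {event: c / total for event, c in counts.items()} if total else {}


# Hypothetical history: 45 of 50 users who did A, B, C went on to D.
history = ["ABCD"] * 45 + ["ABCE"] * 5
probs = next_event_probabilities(history, "ABC")
print(probs)  # {'D': 0.9, 'E': 0.1}
```

The count gives the 90% figure but says nothing about why D follows C, which is the trade-off this slide raises.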




© 2015, Ninja Metrics confidential information.


Machine learning models 


• Tools: WEKA, SAS, SPSS, Spark MLlib, R

• Varying levels of black-boxiness

• Rule-set (JRip example)

• Decision-tree 

• Support Vector Machines 
Choosing the feature space


• Huh? 

• Hello, "domain expert" 

• Feature selection 


• Why bother with the domain experts? 
Rule set (JRip, FOIL, others)


• How do you read these? 

• Mutually exclusive rules 

• Coverage numbers: how many cases does it apply to? How many cases does it get right? (XX/XX)


• Interpretation of the meaning, somewhat like regression in that you look at coefficients, but 
mostly like interaction effects rather than betas. 

• Then, sometimes, actionability: requires a medium to high level of abstraction so they can be 
interpreted and acted upon. You need a person who gets the math and the context. 


• Rule examples from a rejected JRip model that was only about 67% accurate:

- (account_age <= 21) => ischurner=1 (23.16% / 70.63%)

- (SOCIAL_VALUE <= 0) and (account_age >= 28) and (account_age <= 31) => ischurner=1 (0.86% / 64.84%)

- (account_age <= 123) and (SOCIAL_VALUE <= 0.000653) and (account_age <= 93) and (account_age >= 68) and (NUM_XXX <= 0) => ischurner=1 (4.64% / 51.83%)
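To make the reading of such rules concrete, here is a minimal sketch that encodes the three example rules as predicates over a player record. The record layout, the function name, and the default class of 0 when no rule fires are assumptions for illustration; a real rule learner's output would include its own default rule.

```python
# Each rule is a predicate over a player record (a dict keyed by feature name).
RULES = [
    lambda p: p["account_age"] <= 21,
    lambda p: p["SOCIAL_VALUE"] <= 0 and 28 <= p["account_age"] <= 31,
    # The slide's "<= 123" and "<= 93" bounds collapse to "<= 93".
    lambda p: (68 <= p["account_age"] <= 93
               and p["SOCIAL_VALUE"] <= 0.000653
               and p["NUM_XXX"] <= 0),
]


def predict_churner(player):
    """Return 1 if any rule fires, else 0 (assumed default class)."""
    return int(any(rule(player) for rule in RULES))


new_player = {"account_age": 10, "SOCIAL_VALUE": 0.2, "NUM_XXX": 3}
veteran = {"account_age": 200, "SOCIAL_VALUE": 0.2, "NUM_XXX": 3}
print(predict_churner(new_player), predict_churner(veteran))  # 1 0
```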



Rule-based logic 










Decision Trees 

account_age <= 36
| INFLUENCEABILITY <= 0
| | NUM_INVITE <= 0: 1 (34763.0/10892.0)
| | NUM_INVITE > 0: 0 (56.0/8.0)
| INFLUENCEABILITY > 0: 0 (170.0/43.0)
account_age > 36
| INFLUENCEABILITY <= 0.13
| | NUM_INVITE <= 0
| | | NUM_xxxxx <= 0
| | | | account_age <= 94
| | | | | NUM_GIVE_CURRENCY <= 0
| | | | | | account_age <= 88: 0 (10511.0/4826.0)
| | | | | | account_age > 88: 1 (2584.0/1222.0)
| | | | | NUM_GIVE_CURRENCY > 0: 0 (112.0/26.0)
| | | | account_age > 94: 0 (78164.0/25158.0)
| | | NUM_xxxxx > 0: 1 (38.0/8.0)
| | NUM_INVITE > 0: 0 (1259.0/105.0)
| INFLUENCEABILITY > 0.13: 0 (1373.0/113.0)


• Churn prediction using decision trees


• Follow from root node all the way to a leaf for a corresponding rule
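The numbers in parentheses at each leaf can be turned into a per-leaf accuracy. This is a minimal sketch, assuming the tree was printed by a WEKA J48-style learner, where the pair is read as (instances reaching the leaf / instances misclassified there). The parser and its names are hypothetical, and it only handles leaves that show both counts.

```python
import re

# Matches the ": class (reached/misclassified)" tail of a leaf line.
LEAF = re.compile(r":\s*(\d)\s*\(([\d.]+)/([\d.]+)\)")


def leaf_stats(line):
    """Parse one printed tree line; return None for internal nodes."""
    m = LEAF.search(line)
    if m is None:
        return None
    label = int(m.group(1))
    reached = float(m.group(2))
    wrong = float(m.group(3))
    return {"class": label, "reached": reached, "accuracy": 1 - wrong / reached}


stats = leaf_stats("| INFLUENCEABILITY > 0: 0 (170.0/43.0)")
print(stats["class"], round(stats["accuracy"], 3))  # 0 0.747
```

Read this way, a leaf with many instances and a high error count is a weak rule even when the overall tree scores well.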







Support Vector Machines 


+ -2.1931 * (normalized) account_age
+ -3.7646 * (normalized) number_transactions
+ -0.1759 * (normalized) days_inactive_spending
+ -2.0108 * (normalized) different_transactions
+ -1.234 * (normalized) NUM_give_currency
+ -1.909 * (normalized) NUM_Recruited
+ -1.909 * (normalized) NUM_invite_to_play
+ -5.2997 * (normalized) NUM_Joint_viewing
+ -6.0633 * (normalized) NUM_played_with
+ 1.6118 * (normalized) NUM_XXXXXX
+ 1.0722 * (normalized) ASOCIAL_VALUE
+ -1.8388 * (normalized) SOCIAL_VALUE
+ -2.5029 * (normalized) INFLUENCEABILITY
+ 2.5578


• Attribute weights from a support vector machine model
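A linear SVM's output is a weighted sum of the normalized attributes plus a constant, so the weights can be used directly for scoring and for ranking attributes by influence. The sketch below assumes the weights and attribute names on the slide pair up in the order listed, and that inputs are already normalized the same way the model normalized them. Which sign maps to churn is not stated on the slide and is not assumed here.

```python
WEIGHTS = {
    "account_age": -2.1931,
    "number_transactions": -3.7646,
    "days_inactive_spending": -0.1759,
    "different_transactions": -2.0108,
    "NUM_give_currency": -1.234,
    "NUM_Recruited": -1.909,
    "NUM_invite_to_play": -1.909,
    "NUM_Joint_viewing": -5.2997,
    "NUM_played_with": -6.0633,
    "NUM_XXXXXX": 1.6118,
    "ASOCIAL_VALUE": 1.0722,
    "SOCIAL_VALUE": -1.8388,
    "INFLUENCEABILITY": -2.5029,
}
INTERCEPT = 2.5578


def svm_score(normalized):
    """Linear decision value: sum(weight * normalized feature) + intercept."""
    total = sum(w * normalized.get(name, 0.0) for name, w in WEIGHTS.items())
    return total + INTERCEPT


# Attributes ordered by absolute weight, largest first.
ranked = sorted(WEIGHTS, key=lambda name: abs(WEIGHTS[name]), reverse=True)
print(ranked[:3])
```

In this ranking the two largest weights belong to social-play attributes, which is the kind of pattern the next slide asks about.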



Looking for patterns 

• Are you trying to simply get the best model? 

• Are you trying to answer "why?" 

• These were three models of the same 
population. What were the patterns? 



Conclusion: people are compelling 







The black box factor 


• "Deep learning" neural networks 


• Used heavily by FB and Google, e.g. voice recognition 
and image understanding (self-driving cars 
recognizing the environment) 


• Zero actionability possible, but most accurate by far


• There is no output, no model, just a bunch of relationships like the brain's neuron pathways


LTV modeling 

• Two components: LT and V

• LT models: TTL/Churn.

• Cox/Hazard model

• Note the inverse nature of retention and churn approaches
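The two components can be combined in a back-of-the-envelope form. This sketch is a deliberate simplification, not the Cox/hazard approach named above: it assumes a constant per-period churn rate, so expected lifetime is 1 / churn, and a flat value per period. The function names and the figures are hypothetical.

```python
def expected_lifetime(churn_rate):
    """Expected periods a user stays, assuming a constant per-period churn rate."""
    if not 0 < churn_rate <= 1:
        raise ValueError("churn_rate must be in (0, 1]")
    return 1.0 / churn_rate


def ltv(value_per_period, churn_rate):
    """LTV = V (value per period) * LT (expected lifetime in periods)."""
    return value_per_period * expected_lifetime(churn_rate)


retention = 0.8          # hypothetical monthly retention
churn = 1 - retention    # retention and churn are complements
print(round(ltv(2.0, churn), 2))  # 10.0
```

A hazard model replaces the constant churn rate with one that varies by user and over time, but the LT times V structure stays the same.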


Funnel: Facebook Ad #68


Step | Retained from previous step | # Of Users | Time To Step
Facebook Ad #68 (CRM/Advertising) | 100% | 34587 | na
Tutorials (Levels) | 70% | 24211 | 3 min, 45 sec
Group Invite (Social Events/Group Events) | 30% | 7263 | 12 hours, 32 min, 41 sec
Bought any item (Monetization/In-app Purchase) | 34% | 2470 | 15 min, 27 sec
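The retention percentages in the funnel follow from the user counts at each step. A minimal sketch of that arithmetic, with a hypothetical function name and the step labels shortened:

```python
funnel = [
    ("Facebook Ad #68", 34587),
    ("Tutorials", 24211),
    ("Group Invite", 7263),
    ("Bought any item", 2470),
]


def step_retention(steps):
    """Percent of users retained from the previous step (first step is 100%)."""
    out = [(steps[0][0], 100)]
    for (_, prev_users), (name, users) in zip(steps, steps[1:]):
        out.append((name, round(100 * users / prev_users)))
    return out


for name, pct in step_retention(funnel):
    print(f"{name}: {pct}%")
```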
Value models

Social interactions impacting models 
Historical or predictive use by your team? 



LTV vs. CaC 


• What do these acronyms mean, and why is this the most important 
equation in gaming? 

• Cost of Customer Acquisition. Also CPI: cost per install.


• How do you measure return on investment (ROI)? 


• Revenue/ARPU/ARPPU must be tied back to acquisition source, reinforcing the importance of good attribution data. Use of revenue to set RTB pricing.

• Complication from the CFO in currency-based games: Revenue recognized at purchase or exercise?

Can you trust the numbers? Not exactly, no. 
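One common way to express the LTV vs. CaC comparison as ROI is (LTV - CAC) / CAC per acquisition source. The sketch below uses that convention; the source names and dollar figures are hypothetical and do not come from the talk.

```python
def roi(ltv, cac):
    """Return on acquisition spend: (LTV - CAC) / CAC."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    return (ltv - cac) / cac


# Hypothetical per-user LTV and acquisition cost for two sources.
sources = {
    "source_a": {"ltv": 7.50, "cac": 2.50},
    "source_b": {"ltv": 1.25, "cac": 2.50},
}

for name, s in sources.items():
    print(name, roi(s["ltv"], s["cac"]))
```

A source only pays back when its ROI is above zero, which is why revenue has to be tied to the acquisition source before any comparison is possible.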
Attribution: Early days


• Overview: Programmatic vs. brand sourcing, RTB systems, ad sources and publishers, examples 

• What is attribution? Big picture, big deal, it's fixing advertising. 

• Tracking sources. AppsFlyer, AdX (going away), Adjust, Kochava, TUNE (formerly HasOffers). Example:

{"timestamp":"2015-02-17T23:59:59.000Z", "data":
{"account_id":"38897195XXX", "traffic_source_type":"Blind Ferret Media",
"type":"59", "traffic_source":"PC_1_1_blif_250_ios_both_CPI_worldwide"}}

• By 2017: Advertisers will spend $174bn online, despite imperfect practices (Magna Global)

• 54% of businesses use some form of attribution, yet 58% think perfect attribution is impossible (Adobe)

• 38% of those who use it do so manually (ouch)

• Multi-touch data: not on everyone's radar, but should be
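A tracking event like the example on this slide can be read with the standard library. This is a minimal sketch: the key name `account_id` is an assumed reading of the garbled original, and no attempt is made to decode the underscore-separated `traffic_source` string, since its field layout is not documented here.

```python
import json
from datetime import datetime

raw = (
    '{"timestamp":"2015-02-17T23:59:59.000Z","data":'
    '{"account_id":"38897195XXX","traffic_source_type":"Blind Ferret Media",'
    '"type":"59","traffic_source":"PC_1_1_blif_250_ios_both_CPI_worldwide"}}'
)

event = json.loads(raw)
# The trailing "Z" marks UTC; it is matched here as a literal character.
when = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
source = event["data"]["traffic_source"]

print(when.date(), event["data"]["traffic_source_type"], source)
```

Keeping `traffic_source` as an opaque key is enough to join installs to revenue per source.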


Attribution: Early days


[Screenshot: Katana analytics engine, Traffic Sources view, last 3 months]

Site/Creative | Publisher | Users | Population | Conversion | Revenue | ARPU | ARPPU | True Value Adj. | Adj. Revenue | Adj. ARPU | Adj. ARPPU
Average | | 27191.14 | 7.14% | 5.36% | $225,084.06 | $7.99 | $14.23 | | | |
Totals | | 380676 | 100.00% | - | $3,151,176.81 | - | | | | |
Spring blitz A | Direct Marketing | 4518 | 1.19% | 0.05% | $47,367.62 | $5.89 | $10.48 | 102% ▲ | $48,314.97 | $6.01 | $10.69
Spring blitz B | Direct Marketing | 25167 | 6.61% | 1.56% | $303,725.42 | $6.78 | $12.07 | 78% ▼ | $236,905.83 | $5.29 | $9.41
March 1 Flowers | Email campaign | 2371 | 0.62% | 2.56% | $23,929.55 | $5.67 | $10.09 | 118% ▲ | $28,236.87 | $6.69 | $11.91
March 12 Tanks | Email campaign | 9853 | 2.59% | 3.45% | $143,989.77 | $8.21 | $14.61 | 134% ▲ | $192,946.29 | $11.00 | $19.58
March 17 Chaos | Email campaign | 43167 | 11.34% | 1.30% | $350,377.91 | $4.56 | $8.12 | 50% ▼ | $175,188.96 | $2.28 | $4.06
FB AD with child #1 | Facebook Ads | 12897 | 3.39% | 1.25% | $89,503.92 | $1.45 | $2.58 | 60% ▼ | $53,702.35 | $0.87 | $1.55
FB_Ad #1 | Facebook Ads | 12897 | 3.39% | 6.12% | $58,769.05 | $2.56 | $4.56 | 176% ▲ | $103,433.53 | $4.51 | $8.03
FB_Ad #2 | Facebook Ads | 11567 | 3.04% | 3.24% | $521,731.85 | $25.34 | $45.11 | 76% ▼ | $396,516.21 | $19.26 | $34.28
FB_Ad #3 | Facebook Ads | 99876 | 26.24% | 1.25% | $632,894.24 | $3.56 | $6.34 | 156% ▲ | $987,315.01 | $5.55 | $9.89
FB_Ad #4 | Facebook Ads | 23481 | 6.17% | 1.02% | $62,694.27 | $1.50 | $2.67 | 134% ▲ | $84,010.32 | $2.01 | $3.58
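In the dashboard figures, the adjusted revenue for each source works out to its raw revenue multiplied by the True Value Adj. percentage. The sketch below reproduces that arithmetic for three rows. The relationship is inferred from the numbers shown rather than stated in the talk, and the function name is hypothetical.

```python
# (source, revenue in dollars, True Value Adj. as a fraction)
rows = [
    ("Spring blitz A", 47367.62, 1.02),
    ("Spring blitz B", 303725.42, 0.78),
    ("FB_Ad #3", 632894.24, 1.56),
]


def adjusted_revenue(revenue, true_value_adj):
    """Scale raw revenue by the social (True Value) adjustment factor."""
    return round(revenue * true_value_adj, 2)


for name, revenue, adj in rows:
    print(f"{name}: ${adjusted_revenue(revenue, adj):,.2f}")
```

An adjustment above 100% means the source's users drive additional spending in their network; below 100% means part of their revenue is credited to others.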




Decent: Last click 


20% of advertisers rely on this (TagMan). Why? Simplest.

(Graphs, Marin Software)

[Chart: First Click, Second Click, Third Click, Last Click; y-axis 0% to 100%]




Decent: First click 


41% of agencies and 24% of brand managers use it.

Why? Awareness generator theory.

[Chart: First Click, Second Click, Third Click, Last Click]



Decent: Linear 


Throwing stuff at the wall here . . .

[Chart: First Click, Second Click, Third Click, Last Click; y-axis 0% to 60%]
Better: Time Decay


Starting to build in some theory about process and cognition.


May overvalue last click.

[Chart: First Click, Second Click, Third Click, Last Click; y-axis 0% to 60%]


Better: Position-based 


Doesn't over- or under-value first or last, but the values are ultimately arbitrary.

[Chart: First Click, Second Click, Third Click, Last Click; y-axis 0% to 60%]
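The heuristic models covered so far (last click, first click, linear, time decay, position-based) differ only in how they split credit across the touches in a path. The sketch below makes that concrete for a four-touch path. The half-life used for time decay and the 40/20/40 split for position-based are illustrative assumptions, not values from the talk.

```python
def last_click(n):
    return [0.0] * (n - 1) + [1.0]


def first_click(n):
    return [1.0] + [0.0] * (n - 1)


def linear(n):
    return [1.0 / n] * n


def time_decay(n, half_life=1.0):
    """Credit halves for every `half_life` touches before the last one (assumed form)."""
    raw = [2 ** (-(n - 1 - i) / half_life) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def position_based(n, first=0.4, last=0.4):
    """Fixed shares for first and last touch; the rest is split evenly (assumed shares)."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [first / (first + last), last / (first + last)]
    middle = (1 - first - last) / (n - 2)
    return [first] + [middle] * (n - 2) + [last]


for model in (last_click, first_click, linear, time_decay, position_based):
    print(model.__name__, [round(w, 3) for w in model(4)])
```

Every model returns weights that sum to 1, so multiplying them by the revenue of a converted path gives the credit per touch.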





Best: Data-driven & modeled 


Rather than using a theory or intuition, we rely solely on observed patterns.

[Chart: First Click, Second Click, Third Click, Last Click; y-axis 0% to 60%]



Data-driven attribution models 


• Let Z = installation, and A, B, C, D, ... be other events. A "_" event means the sequence ended.


1. ABCDZ_

2. ABCZ_

3. BCDZ_

4. BCZ_

5. ABC_

6. ACDB_

7. BC_

• Path 2 vs. Path 4: Isolates “A” 


- Example: Sequence 2 leads to a 20% install rate 

- Example: Sequence 4 leads to a 15% install rate 

• Conclusion: Ad A has an incremental effect of 5 percentage points when sequenced before B and C. (It may be different solo, but we can have a sequence for that as well.)
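The comparison of Path 2 against Path 4 can be sketched as a difference in observed install rates between users exposed to A, B, C and users exposed to only B, C. The path counts below are hypothetical, chosen to reproduce the 20% and 15% figures from the example, and the function name is an assumption.

```python
def install_rate(paths, touches):
    """Share of users with exactly these touches who then installed (Z)."""
    converted = paths.count(touches + "Z_")
    dropped = paths.count(touches + "_")
    total = converted + dropped
    return converted / total if total else None


# Hypothetical observations: 100 users saw A, B, C and 100 saw only B, C.
paths = (
    ["ABCZ_"] * 20 + ["ABC_"] * 80
    + ["BCZ_"] * 15 + ["BC_"] * 85
)

with_a = install_rate(paths, "ABC")
without_a = install_rate(paths, "BC")
incremental = with_a - without_a
print(round(with_a, 2), round(without_a, 2), round(incremental, 2))  # 0.2 0.15 0.05
```

The same subtraction can be repeated for any pair of paths that differ by a single event, given enough observations of each.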
Network models


• Who cares? 


• Origin: Improvements in F-scores in IARPA project


• Cross-sectional (e.g., centrality) vs. dynamic, causal, over-time
Social Network Analysis

Do Network Forces Matter?


• Ye gods, more than we thought, yes. 

• Major benefits: improved models, uncovering new 
dynamics, associations with product/mechanics. 


• Benchmark: 10-70% of play is purely network-driven

• Benchmark: 6-60% of spending is purely network-driven



General report statistics 


Data size: 365m accounts, 2013-present

Accuracy rate: 85% 
Drop in Neighbors' Spending


$236.86
EA SPORTS


PC Hardcore Multiplayer 

Average is 30% 


MMOs 

Average is 60% 




Adjustments by Geo, Channel, Ad


• Minimum 5,000 accounts 
Most influential players, global


Laos: +2,558% 
Palestine: +2,331% 


Cambodia: +945% 


Sudan: +840% 
Iran: +672% 



Algeria: +2,558% 
Ukraine: +2,331% 


Belarus: +945% 


Pakistan: +840% 
Syria: +672% 






Least influential players, global 



Kenya: -57% 
Iceland: -34% 
Norway: -26% 
Switzerland: -25% 




Australia: -24% 
USA: -24% 
Japan: -23% 



U.K.: -23% 


South Africa: 
The most social users, by acquisition source


1. +193%

2. +110%

3. +104%

4. +62%

5. +38.7%


Other notables: 

Organic, +14% 
Lowest: -27% 
What about creative?


• We did not report on creatives, and they matter even more:

Creative A vs. B, same channel 
High: +900% (second was +310%) 

Low: -70% 




Variance: 131% 






Dmitri Williams, CEO 

dmitri@ninjametrics.com 